Abstract

In this paper we introduce a method for set separation based on quantum
computation. When no a priori knowledge about the source signal
distribution is available, finding an optimal decision rule that can be
implemented in the separating algorithm is a challenging task. We lean on
the Maximum Likelihood approach and build a bridge between this method and
quantum counting. The proposed method is also able to distinguish between
disjoint sets and intersecting sets.



1 Introduction 

In the course of signal and/or data processing, fast classification of the
input data is often helpful as a preprocessing step for decision
preparation. Assume that the data to be classified, $\mu \in M$, are well
defined and fall into a given number of classes or sets,
$A := \{\mu \in M : A(\mu)\}$, $B := \{\mu \in M : B(\mu)\}$, ...,
$Z := \{\mu \in M : Z(\mu)\}$. Performing the classification is then
equivalent to a set separation task.

The separation problem has many facets. Sparsely distributed input data
make the determination of the decision lines between the classes a hard
(often nonlinear) task. Moreover, the probability distribution of the input
data may not be known a priori, which results in an unsupervised
classification problem, also known as clustering. A further open question
is how to classify input sequences when only the original
measurement/information data are known almost surely, while the observed
system adds stochastically changing behavior to them. The classification
then becomes a statistical decision problem, which can be extremely hard to
solve as the number of possibilities increases. Finding an optimal solution
is therefore time consuming, which leaves broad room for suboptimal ones.
With the assistance of quantum computation we introduce an optimal solution
whose computational complexity is much lower than that of the classical
approaches.

This paper is organized as follows. In Sect. 2 the quantum computation
basics related to set separation are highlighted. The system model is
described in Sect. 3, followed by the proposed set separation algorithm in
Sect. 4. The main achievements are summarized in Sect. 5.

2 Quantum Computation 

In this section we give a brief overview of the parts of quantum
computation that are relevant to this paper. For a more detailed
description, please refer to the cited literature.

In classical information theory the smallest information-conveying unit is
the bit. Its counterpart in quantum information is the "quantum bit", the
qubit. Its state can be described by the state vector
$|\varphi\rangle = \alpha|0\rangle + \beta|1\rangle$, where
$\alpha, \beta \in \mathbb{C}$ are the complex probability amplitudes and
$|\alpha|^2 + |\beta|^2 = 1$. The expression $|\alpha|^2$ denotes the
probability that, after measurement, the qubit is found in the
computational basis state $|0\rangle$, and $|\beta|^2$ is the probability
of finding it in the computational basis state $|1\rangle$. More generally,
an $n$-qubit "quantum register" (qregister) $|\varphi\rangle$ is built from
qubits and spanned by the computational basis states $|x\rangle$,
$x = 0, \ldots, N-1$, so that $N = 2^n$ states can be stored in the
qregister at the same time [5]:

$$|\varphi\rangle = \sum_{x=0}^{N-1} \varphi_x |x\rangle, \qquad \varphi_x \in \mathbb{C}, \tag{1}$$

where $N$ denotes the number of states, $\langle x|j\rangle = 0$ for all
$x \neq j$, $\langle x|x\rangle = 1$, and $\sum_x |\varphi_x|^2 = 1$. It is
worth mentioning that a transformation $U$ on a qregister is executed in
parallel on all $N$ stored states, which is called quantum parallelism. To
ensure reversibility of the transformation, $U$ must be unitary,
$U^{-1} = U^{\dagger}$, where the superscript $\dagger$ denotes the
Hermitian conjugate (adjoint) of $U$. Quantum registers can be set to a
general state using quantum gates [5], each of which can be represented by
a unitary operation described by a square matrix.
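As a purely classical illustration of equation (1) and of the unitarity condition, the following minimal sketch stores the amplitudes of a small register in a Python list. The helper names (`uniform_register`, `probabilities`, `apply`, `is_unitary`) and the choice of the Hadamard gate as example are assumptions made for illustration; they do not appear in the paper.

```python
import math

# 2x2 Hadamard gate, used here only as a convenient example of a unitary.
HADAMARD = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
            [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def uniform_register(n):
    """Amplitudes phi_x of an n-qubit register in equal superposition."""
    size = 2 ** n  # N = 2^n basis states
    return [1 / math.sqrt(size)] * size


def probabilities(amplitudes):
    """Measurement probabilities |phi_x|^2 of the basis states |x>."""
    return [abs(a) ** 2 for a in amplitudes]


def apply(matrix, amplitudes):
    """Apply a square matrix to the amplitude vector (matrix-vector product)."""
    return [sum(row[j] * amplitudes[j] for j in range(len(amplitudes)))
            for row in matrix]


def is_unitary(matrix, tol=1e-12):
    """Check U^dagger U = I, i.e. that the inverse equals the adjoint."""
    dim = len(matrix)
    for i in range(dim):
        for j in range(dim):
            entry = sum(complex(matrix[k][i]).conjugate() * matrix[k][j]
                        for k in range(dim))
            expected = 1.0 if i == j else 0.0
            if abs(entry - expected) > tol:
                return False
    return True


phi = uniform_register(3)                 # N = 8 stored states
print(sum(probabilities(phi)))            # normalization, close to 1
print(is_unitary(HADAMARD))               # True
print(apply(HADAMARD, [1.0, 0.0]))        # |0> -> (|0> + |1>)/sqrt(2)
```

A list of $2^n$ amplitudes grows exponentially with $n$, which is exactly why such a simulation is only feasible for toy register sizes.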

3 System Model 

For the sake of simplicity a 2-dimensional set separation is assumed, where
the original source data take the values $\mu \in M$ and are chosen from
the sets $s = 0$ and $s = 1$. No additional information on the source is
available; in particular, nothing is known about its probability density
function (pdf). The general set separation system is depicted in Fig. 1.
The observed signal $r$, disturbed by the system $A$, becomes the input
data, which are to be separated into the two sets ($s = 0$ and $s = 1$)
again.

Figure 1: General set separation system

In the set separator a quantum register $|\varphi\rangle$, as described by
equation (1) and shown in Fig. 2, is used to store all the parameters of
the possible system disturbance (e.g. delay, heat, velocity) in a suitably
chosen quantization (see footnotes 1 and 2). As an example, the qregister
$|\varphi\rangle$ may store properly prepared, quantized delay and velocity
values, e.g. the values
$1.0 \cdot 10^{-1}, 1.1 \cdot 10^{-1}, \ldots, 1.0 \cdot 10^{-10}$ and
1.0 m/s, 1.1 m/s, ..., 100 m/s. These values are of little use on their
own; what matters is the combination of these effects, i.e. of these
values, whose number could exceed the capacity of any database. To handle
this large amount of data, a virtual database is introduced.

Definition 3.1 To build up a virtual database a function

$$y = g(s, x) \tag{2}$$

is defined, where $s \in S$ identifies the sets and $x$ denotes the index of
the qregister $|\varphi\rangle$. The function value $y_i = g(s, x_i)$
points to a record in the virtual database.

3.1 Properties of the Function $g(\cdot)$

The function $g(s, x)$ is not necessarily one-to-one and consequently not
invertible, except for special cases in which the virtual database contains
$r = g(s, x)$ only once. In such a case the parameter settings of the
system $A$ are easy to determine. Nevertheless, the fact that an entry
occurs only once in the virtual database described by $g(s_i, x)$ does not
exclude the same entry from the other virtual database generated by
$g(s_j, x)$, $i \neq j$, which makes a trivial decision impossible.
Henceforth it should be kept in mind that $g(s, x)$ is in almost every case
a so-called one-way function: it is easy to evaluate in one direction, but
its inverse is hard to compute.
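The two properties above (several indices mapping to one entry, and an entry that is unique in one virtual database but still present in the other) can be seen on a toy example. The function `g` below, the 4-qubit register and the values in `BASE` are hypothetical choices made only for illustration; the paper does not specify a concrete $g$.

```python
from collections import defaultdict

N_QUBITS = 4               # toy register length, N = 16 indices
BASE = {0: 0, 1: 3}        # hypothetical undisturbed source value per set


def g(s, x):
    """Toy virtual-database function y = g(s, x) (an assumption, not the paper's g)."""
    a = x >> 2             # upper two bits: first quantized disturbance
    b = x & 0b11           # lower two bits: second quantized disturbance
    return BASE[s] + a - b


def preimages(s):
    """Map every entry y to the list of indices x with g(s, x) = y."""
    table = defaultdict(list)
    for x in range(2 ** N_QUBITS):
        table[g(s, x)].append(x)
    return dict(table)


db0, db1 = preimages(0), preimages(1)
print(db0[0])   # [0, 5, 10, 15]: several indices give the same entry
print(db0[3])   # [12]: unique in database 0 ...
print(db1[3])   # [0, 5, 10, 15]: ... yet the same entry occurs in database 1
```

Building the full preimage table is what a classical approach would have to do; it is shown here only to make the non-invertibility visible.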

The function $g(\cdot)$ generates all the possible disturbances added to
the considered input value $\mu$ belonging to the set $s = 0$ or $s = 1$ of
the system.

1 Quantization is NOT a quantum computation operation!

2 The quantization method, i.e. linear or nonlinear, is outside the scope of this paper.

Figure 2: Quantum register $|\varphi\rangle$



This is of course a large amount of information, $2N = 2^{n+1}$ values,
where $n$ is the length of the qregister $|\varphi\rangle$. As an example,
assume a 15-qubit qregister. The function $g(\cdot)$ in (2) generates
$2^{15} = 32768$ output values at the same time for $s = 0$ and the same
number of outputs for $s = 1$. Given the large number of possible points in
the set surface, optimal classification in a classical way becomes
difficult.

At first glance this problem looks even more difficult to solve. However,
by exploiting the computational power of quantum computation, in this case
the Deutsch-Jozsa quantum parallelization algorithm, an arbitrary unitary
operation can be executed on all the prepared states simultaneously.

3.2 Quantum Search in Qregister $|\varphi\rangle$

Roughly speaking, the task is to find the entry (or entries) in the virtual
databases that is (are) equal to the observed data $r$. To accomplish this
database search, the Grover database search algorithm is invoked. In
Sect. 3 we proposed to set up a qregister, which has to be built only once.
A suitable database search algorithm is then needed to determine which
function $g_{0,1}(s_{0,1}, x)$, taking the index $x$ from the qregister
$|\varphi\rangle$, contains the searched entry, if any at all. We apply the
optimal quantum search algorithm $Q$ proposed by Grover, as depicted in
Fig. 3. We feed the received signal $r(t)$ to the oracle ($O$), where the
function $f(r, g(s, x))$ is evaluated such that

$$f(r, g(s, x)) = \begin{cases} 1 & \text{if } g(s, x) = r, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

Figure 3: The Grover database search circuit

Assuming that there are $M$ solutions for the search in the qregister
$|\varphi\rangle$,

$$|\varphi\rangle = \sqrt{\frac{N-M}{N}}\,|\alpha\rangle + \sqrt{\frac{M}{N}}\,|\beta\rangle, \tag{4}$$

where $|\alpha\rangle$ consists of those configurations of $|x\rangle$ that
do not result in $g(s, x) = r$, while $|\beta\rangle$ consists of those
that do.

Because the bound on the number of Grover iterations is tight, fewer
iterations may also be adequate in real applications [11].
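The search step can be emulated classically for very small registers. The sketch below is an illustration under stated assumptions, not the circuit of Fig. 3: it keeps all $N$ amplitudes in a list, uses the textbook iteration count $\lfloor \frac{\pi}{4}\sqrt{N/M} \rfloor$ (which this section does not state explicitly), and reuses a hypothetical toy function `g` in place of the real virtual database.

```python
import math


def grover_search(marked, n_qubits):
    """Simulate Grover iterations classically on a list of real amplitudes.

    `marked` is the set of indices x for which the oracle returns 1.
    Returns the probability of measuring a marked index and the number
    of iterations used.
    """
    if not marked:
        raise ValueError("at least one marked index is required")
    size = 2 ** n_qubits
    amps = [1 / math.sqrt(size)] * size               # equal superposition
    iterations = math.floor(math.pi / 4 * math.sqrt(size / len(marked)))
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        amps = [-a if x in marked else a for x, a in enumerate(amps)]
        # Diffusion: inversion about the mean amplitude.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]
    p_marked = sum(amps[x] ** 2 for x in marked)
    return p_marked, iterations


def g(s, x):
    """Toy stand-in for the virtual-database function (hypothetical)."""
    return (0, 3)[s] + (x >> 2) - (x & 0b11)


r = 0                                                 # observed data
marked = {x for x in range(16) if g(0, x) == r}       # indices with f = 1
p, k = grover_search(marked, n_qubits=4)
print(len(marked), k, round(p, 6))                    # 4 1 1.0
```

With $N = 16$ and $M = 4$ a single iteration already concentrates all probability on the matching indices; for other ratios $M/N$ the success probability after the last iteration is high but generally below one.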



4 Set Separation 

Let us turn our attention back to the separation of the observed data $r$
into the predefined sets.

In the special case where only one of the virtual database descriptor
functions, either $g(s_0, x)$ or $g(s_1, x)$, contains an entry identical
to the observed data $r$, the set separation can be performed easily.

A more realistic case is that the two sets have an intersection, as shown
in Fig. 4. Even if the original sets are disjoint, they can overlap after
passing through the observed system because of the disturbances. After
evaluating the functions $g_{0,1}(s_{0,1}, x)$ it can happen that the same
record is present multiple times, which shows the non-invertible behavior
of the function (2). Originally, the input signal was chosen from
well-defined disjoint sets without a priori known probability
distributions. The assignment of $r$ to one of the sets, either $s_0$ or
$s_1$, should be based on a Maximum Likelihood decision.

Let us assume that we have a random variable $r$. Its measured value
depends on an element $x_l$ selected from a finite set ($l = 1, \ldots, L$)
and on a process that can be characterized by a conditional pdf
$f(r|x_l)$ belonging to the given element. Our task is to decide which
$x_l$ was selected if a certain $r$ has been measured. Each guess $H_l$ for
$x_l$ can be regarded as a hypothesis. Decision theory therefore deals with
the design and analysis of suitable rules that connect the set of
observations with the set of hypotheses.

If the unconditional (a priori) probabilities $P(x_l)$ are known, then the
Bayes formula allows us to compute the conditional (a posteriori)
probabilities $P(H_l|r)$ in the following way:

$$P(H_l|r) = \frac{f(r|x_l)\,P(x_l)}{\sum_{i=1}^{L} f(r|x_i)\,P(x_i)}.$$

The most pragmatic solution is obviously to choose the $H_l$ belonging to
the largest $P(H_l|r)$. This type of hypothesis testing is called maximum a
posteriori (MAP) decision.

If the a priori probabilities are unknown or the $x_l$ are equiprobable,
then a maximum likelihood (ML) decision can be used. In order to minimize
the probability of error, it selects the $H_l$ that yields the largest
$f(r|x_l)$ when the observed $r$ is substituted:

$$\max_{l} L(r, x_l), \qquad L(r, x_l) = f(r|x_l).$$
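The difference between the MAP and the ML rule can be made concrete with a short sketch. The numerical values of the likelihoods and priors below are hypothetical and were chosen only so that the two rules disagree; the function names are likewise illustrative.

```python
def posteriors(likelihoods, priors):
    """Bayes formula: P(H_l | r) for every hypothesis l."""
    joint = [f * p for f, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def map_decision(likelihoods, priors):
    """Index l of the hypothesis with the largest posterior."""
    post = posteriors(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


def ml_decision(likelihoods):
    """Index l of the hypothesis with the largest likelihood f(r | x_l)."""
    return max(range(len(likelihoods)), key=likelihoods.__getitem__)


likelihoods = [0.2, 0.5]      # hypothetical values of f(r | x_l) at the observed r
priors = [0.8, 0.2]           # hypothetical a priori probabilities P(x_l)

print(map_decision(likelihoods, priors))       # 0: the prior outweighs the likelihood
print(ml_decision(likelihoods))                # 1
print(map_decision(likelihoods, [0.5, 0.5]))   # 1: with equal priors MAP equals ML
```

With equal priors the denominator and the prior factor are the same for every hypothesis, so the MAP rule reduces to comparing the likelihoods, which is the situation assumed in the rest of the paper.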
Figure 4: Sets with intersection

Figure 5: The two density functions $f(r|s = 0)$ and $f(r|s = 1)$

The Maximum Likelihood estimator requires knowledge of the probability
density function of the observed signal. Employing the Grover database
search algorithm we are able to find the entries in the virtual databases.
However, a complete search is not needed, because the search result itself,
i.e. the exact index (indices) of the searched item(s), is not of interest;
what matters is how often a given configuration occurs in $g(s, x)$, if at
all. For that purpose a new function $f(\cdot)$ is defined.

Definition 4.1 The function

$$f(r|s) = \#(x : r = g(s, x)) \tag{5}$$

counts the number of identical entries in the virtual database, which
corresponds to the conditional probability density function of $r$ being in
the set $s$.

For that reason it is worth moving on to quantum counting [12], which is
based on the Grover iteration.

4.1 Set Separation Method

The two curves in Fig. 5 represent the number of identical entries in the
virtual databases, i.e. the pdfs, according to $f(r|s = 0)$ and
$f(r|s = 1)$, respectively.
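Quantum counting obtains the number of matches $M = f(r|s)$ from the rotation angle $\theta$ of the Grover iteration, which is estimated by phase estimation. The paper does not spell out the relation; the sketch below assumes the standard textbook form $\sin^2(\theta/2) = M/N$. It does not simulate phase estimation itself: the function `quantize_angle` is a hypothetical, idealized stand-in that merely rounds the phase to `t_bits` binary digits to mimic the finite resolution of the readout.

```python
import math


def grover_angle(size, matches):
    """Rotation angle theta of one Grover iteration, sin(theta/2) = sqrt(M/N)."""
    return 2 * math.asin(math.sqrt(matches / size))


def count_from_angle(size, theta):
    """Invert the relation above: M = N * sin^2(theta / 2)."""
    return size * math.sin(theta / 2) ** 2


def quantize_angle(theta, t_bits):
    """Idealized t-bit phase readout: round theta / (2*pi) to t binary digits."""
    steps = 2 ** t_bits
    return 2 * math.pi * round(theta / (2 * math.pi) * steps) / steps


N = 16                                   # register size, N = 2^4
theta = grover_angle(N, 4)               # exact angle for M = 4 matches
estimate = count_from_angle(N, quantize_angle(theta, t_bits=8))
print(round(estimate))                   # 4
```

Because only the count is needed for the decision, a coarse angle estimate is often sufficient: the estimate only has to be accurate enough to tell $f(r|s = 0)$ and $f(r|s = 1)$ apart.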
If there is an entry (or entries) only in $y_i$ but not in $y_j$ of the
functions $g_{0,1}(s_{0,1}, x)$, where $i, j \in \{0, 1\}$ and $i \neq j$,
the decision is 100 percent certain, following the decision rules in
Table 1. These areas are the non-overlapping parts of the sets in Fig. 4
and the outer parts (up to the vertical dashed black lines) in Fig. 5. If,
however, both $f(r|s = 0)$ and $f(r|s = 1)$ are nonzero, a prediction can
still be made according to the Maximum Likelihood decision rule.



Table 1: Set Separation Decision Rules

$f(r|s_0)$      $f(r|s_1)$      Decision
0               0               $|\varphi\rangle$ was badly prepared
0               $\neq 0$        $r$ belongs to set $s = 1$
$\neq 0$        0               $r$ belongs to set $s = 0$
$f(r|s_0) > f(r|s_1)$           $r$ belongs to set $s = 0$
$f(r|s_0) < f(r|s_1)$           $r$ belongs to set $s = 1$


All the possible states of the qregister $|\varphi\rangle$ are evaluated by
the function $g$ for $s = 0$ and for $s = 1$ simultaneously, and the results
are compared with the system output $r$. If at least one output $y_0$ or
$y_1$ with the parameter settings $x$ matches the system output $r$, then
$r$ is assigned to the set $s = 0$ or $s = 1$, respectively. In the more
interesting case, at least one value of $y_0$ and also at least one value
of $y_1$ matches $r$, so that the system output could be classified into
both sets: an intersection arises. This results in an uncertain prediction,
and it is this case on which we focus here.

We assume no a priori knowledge of the probability distribution of the
input sequence, so it is assumed to be uniformly distributed. Suppose that,
after counting, the number of matches with the system output $r$ given by
$f(r|s = i)$ is higher than that given by $f(r|s = j)$, where
$i, j \in \{0, 1\}$. According to the decision rule in Table 1, $r$ then
belongs to set $s = i$ rather than to set $s = j$.

The Method To perform a set separation nothing more is required than to

1. Prepare the qregister $|\varphi\rangle$,

2. Evaluate the functions $y_i = g_i(s = i, x)$, where $i \in \{0, 1\}$ in
the 2-dimensional case,

3. Count the entries in the virtual databases that are equal to the
observed data $r$, i.e. $f(r|s)$ (see Fig. 5),

4. Use the decision table, Table 1, to assign $r$ to the set $s = 0$ or
$s = 1$.
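The four steps above can be emulated classically on a toy example. This is a sketch under stated assumptions, not the quantum algorithm: the function `g`, the register length and the values in `BASE` are hypothetical, the brute-force loop in `count_matches` stands in for quantum counting, and the "undecided" outcome for equal nonzero counts is an addition, since Table 1 does not cover ties.

```python
N_QUBITS = 4                 # toy register length (assumption)
BASE = {0: 0, 1: 3}          # hypothetical undisturbed value of each set


def g(s, x):
    """Toy virtual-database function y = g(s, x); not the paper's g."""
    return BASE[s] + (x >> 2) - (x & 0b11)


def count_matches(r, s):
    """f(r|s): number of indices x with g(s, x) = r (brute-force stand-in
    for quantum counting)."""
    return sum(1 for x in range(2 ** N_QUBITS) if g(s, x) == r)


def decide(f0, f1):
    """Apply the decision rules of Table 1 to the two counts."""
    if f0 == 0 and f1 == 0:
        return "register badly prepared"
    if f0 > f1:
        return "s = 0"
    if f0 < f1:
        return "s = 1"
    return "undecided"       # tie: not covered by Table 1


def separate(r):
    """Steps 2 to 4 of the method for one observed value r."""
    return decide(count_matches(r, 0), count_matches(r, 1))


for r in (-3, 1, 2, 6, 9):
    print(r, count_matches(r, 0), count_matches(r, 1), separate(r))
```

In this toy setting the values -3 and 6 occur in only one virtual database and are therefore decided with certainty, 1 and 2 lie in the intersection and are decided by the larger count, and 9 occurs in neither database.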






5 Concluding Remarks 



In this paper we showed a connection between Maximum Likelihood hypothesis
testing and quantum counting, used for quantum set separation. We
introduced a set separation algorithm based on quantum counting, which is
employed to estimate the conditional probability density function of the
observed data with respect to the candidate sets. In our case the pdfs are
estimated fully at a single point by invoking the quantum counting
operation only once, which makes the decision simple and reliable. In
addition, one should keep in mind that the qregister $|\varphi\rangle$ has
to be set up only once before the separation. The virtual databases are
generated once and fed directly to the Oracle of the Grover block in the
quantum counting circuit, which reduces the computational complexity
substantially.